fix(core): require line-anchored frontmatter fences in file_utils #973
Merged
has_frontmatter(), parse_frontmatter(), and remove_frontmatter() detected
frontmatter by substring/split (`content.startswith("---")` plus
`content.split("---", 2)`) rather than by line-anchored fences. A single-line
string that merely starts with `---` — e.g. `---\nstatus: active\n---\nBody`
where `\n` are literal backslash-n characters, a common CLI/agent input shape —
was misread as frontmatter: yaml parsed `\nstatus` as a key that got merged into
the note's YAML on disk, and the body was silently transformed.
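To make the misread concrete, here is a minimal sketch of the loose check described above. The function names `loose_has_frontmatter` and `loose_split` are hypothetical stand-ins for illustration, not the actual code in `file_utils.py`; only the `startswith("---")` and `split("---", 2)` calls come from the description.

```python
# A raw string: the \n sequences are literal backslash-n characters,
# so the whole input is a single line with no real newlines.
one_liner = r"---\nstatus: active\n---\nBody"


def loose_has_frontmatter(content: str) -> bool:
    """Hypothetical stand-in for the old substring-based detection."""
    return content.startswith("---")


def loose_split(content: str) -> list:
    """Hypothetical stand-in for the old split-based parsing."""
    return content.split("---", 2)


# The one-liner passes the substring check even though no fence
# sits on its own line.
print(loose_has_frontmatter(one_liner))  # True

# The middle piece is what would have been handed to the YAML parser;
# it begins with a literal backslash-n followed by "status".
print(loose_split(one_liner))
```

The middle element of the split is the text that ended up being treated as YAML, which is how a key beginning with a literal `\n` could appear.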
Introduce a shared `_split_frontmatter()` helper that anchors both fences to
their own lines (`^---[ \t]*$`), tolerating leading blank lines so dedented
heredoc-style content still parses. All three public helpers now delegate to it,
preserving existing behavior for valid frontmatter (BOM stripping, empty -> {},
non-dict -> ParseError, and the existing ParseError messages callers assert on).
Closes #972
Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Drew Cain <groksrc@gmail.com>
Closes #972
Problem
`has_frontmatter()`, `parse_frontmatter()`, and `remove_frontmatter()` in `src/basic_memory/file_utils.py` detected frontmatter by substring/split (`content.startswith("---")` plus `content.split("---", 2)`) rather than by line-anchored fences.
Repro
A single-line string that merely starts with `---`, where `\n` are literal backslash-n characters, is a very common CLI/agent input shape. The CLI receives the literal one-line string `---\nstatus: active\n---\nDiscussed Q3 roadmap with Anthony.`. The loose logic treated it as frontmatter:
- YAML parsed `\nstatus` as a key, which got merged into the note's YAML and written to disk.
- The body was silently transformed (the `---…---` segment was stripped).
Fix
Shape chosen: (b) anchor the parsing ourselves. A new shared `_split_frontmatter()` helper requires both fences on their own line (`^---[ \t]*$`); all three public helpers delegate to it, so detection is consistent.
Why (b) over delegating to `python-frontmatter`: the existing helpers raise `ParseError` with specific messages that callers and tests assert on (`"Content has no frontmatter"`, `"Invalid frontmatter format"`, `"Frontmatter must be a YAML dictionary"`, `"Invalid YAML in frontmatter"`). Anchoring in-place keeps the diff minimal and preserves every one of those messages plus BOM stripping, empty-frontmatter -> `{}`, and non-dict -> `ParseError`.
The helper skips leading blank lines before the opening fence, so dedented heredoc-style content (a string beginning with a newline) still parses exactly as before. This does not relax line-anchoring, since the single-line repro's first line is not a bare `---`.
All callers were reviewed (`markdown/utils.py` merge path, `entity_service`, `sync_service`, `file_service`, `batch_indexer`); none relied on the loose substring behavior.
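The line-anchored approach can be sketched roughly as follows. This is an illustration under assumptions, not the merged `_split_frontmatter()`: the name `split_frontmatter_sketch`, the `(yaml_text, body)` return shape, and returning `None` for "no frontmatter" are all hypothetical. Only the fence pattern, the leading-blank-line tolerance, and BOM stripping come from the description.

```python
import re
from typing import Optional, Tuple

# Fence pattern from the description: three dashes, optional trailing
# spaces or tabs, and nothing else on the line.
FENCE = re.compile(r"^---[ \t]*$")


def split_frontmatter_sketch(content: str) -> Optional[Tuple[str, str]]:
    """Hypothetical sketch: return (yaml_text, body), or None if the
    content does not open with a pair of line-anchored fences."""
    # Strip a leading byte-order mark, as the existing helpers do.
    if content.startswith("\ufeff"):
        content = content[1:]

    lines = content.splitlines(keepends=True)

    # Tolerate leading blank lines (dedented heredoc-style input).
    start = 0
    while start < len(lines) and lines[start].strip() == "":
        start += 1

    # The first non-blank line must be a bare fence.
    if start >= len(lines) or not FENCE.match(lines[start].rstrip("\r\n")):
        return None

    # The closing fence must also sit on its own line.
    for end in range(start + 1, len(lines)):
        if FENCE.match(lines[end].rstrip("\r\n")):
            yaml_text = "".join(lines[start + 1:end])
            body = "".join(lines[end + 1:])
            return yaml_text, body

    # Opening fence without a closing fence: not frontmatter.
    return None


# Real newlines: detected as frontmatter.
print(split_frontmatter_sketch("---\nstatus: active\n---\nBody"))

# Literal backslash-n one-liner from the repro: not frontmatter.
print(split_frontmatter_sketch(r"---\nstatus: active\n---\nBody"))
```

Because the one-liner's only line is `---` followed by a backslash rather than end-of-line, the opening-fence check fails and the content passes through untouched.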
Tests
- `tests/utils/test_file_utils.py`: line-anchored detection for the exact one-line repro, inline `---` later in the first line, opening fence with no closing fence, and fences with trailing whitespace (valid); plus a pass-through regression confirming `remove_frontmatter` leaves the one-liner intact and `parse_frontmatter` raises rather than inventing a `\nstatus` key.
- `tests/mcp/test_tool_write_note.py`: integration-level regression through `write_note`, confirming the literal one-liner is stored verbatim as the body and no garbage key leaks into the generated YAML frontmatter.
Gates
- `uv run pytest tests/utils/test_file_utils.py tests/mcp/test_tool_write_note.py`: 90 passed
- `uv run ruff check` / `ruff format --check` on changed files: clean
- `uv run ty check src`: clean
🤖 Generated with Claude Code